11 research outputs found

    Synthesizing Generic Experimental Environments for Simulation

    Experiments play an important role in parallel and distributed computing. Simulation is a common experimental technique that relies on abstractions of the tested application and execution environment, but offers reproducibility of results and fast exploration of numerous scenarios. This article focuses on setting up the experimental environment of a simulation run. First, we analyze the requirements expressed by different research communities. As the existing tools in the literature are too specific, we then propose a more generic experimental environment synthesizer called SIMULACRUM. This tool allows its users to select a model of a currently deployed computing grid or to generate a random environment. The user can then extract a subset of it that fulfills his or her requirements and finally export the corresponding XML representation.
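The select-filter-export workflow described in the abstract can be sketched as follows. This is a minimal illustration only: the host attributes, element names, and selection criterion are assumptions for the sake of the example, not SIMULACRUM's actual schema or API.

```python
import random
import xml.etree.ElementTree as ET

def synthesize_environment(hosts, min_speed, count, seed=42):
    """Pick `count` hosts meeting a speed requirement and export them
    as a toy XML platform description (schema is illustrative)."""
    rng = random.Random(seed)  # fixed seed keeps the synthesis reproducible
    eligible = [h for h in hosts if h["speed"] >= min_speed]
    chosen = rng.sample(eligible, count)
    root = ET.Element("platform")
    for h in chosen:
        ET.SubElement(root, "host", id=h["id"], speed=str(h["speed"]))
    return ET.tostring(root, encoding="unicode")

# A hypothetical 16-node grid with four host speed classes.
grid = [{"id": f"node-{i}", "speed": 1e9 * (1 + i % 4)} for i in range(16)]
xml_doc = synthesize_environment(grid, min_speed=2e9, count=3)
```

Keeping the random generator seeded mirrors the article's emphasis on reproducibility: the same requirements always yield the same synthesized environment.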

    Towards Scalable, Accurate, and Usable Simulations of Distributed Applications and Systems

    The study of parallel and distributed applications and platforms, whether in the cluster, grid, peer-to-peer, volunteer, or cloud computing domain, often mandates empirical evaluation of proposed algorithmic and system solutions via simulation. Unlike direct experimentation via an application deployment on a real-world testbed, simulation enables fully repeatable and configurable experiments that can often be conducted quickly for arbitrary hypothetical scenarios. In spite of these promises, current simulation practice is often not conducive to obtaining scientifically sound results. State-of-the-art simulators are often not validated and their accuracy is unknown. Furthermore, due to the lack of accepted simulation frameworks and of transparent simulation methodologies, published simulation results are rarely reproducible. We highlight recent advances made in the context of the SimGrid simulation framework with a view to addressing this predicament across the aforementioned domains. These advances, which pertain both to science and to engineering, together lead to unprecedented combinations of simulation accuracy and scalability, allowing the user to trade off one for the other. They also enhance simulation usability and reusability so as to promote an Open Science approach for simulation-based research in the field.

    MINTCar: A tool for multiple source multiple destination network tomography

    Identifying a network topology and inferring its performance is a well-known problem. Achieving this using only end-to-end measurements at the application level is known as network tomography. When the produced topology reflects the capacities of sets of links with respect to a metric, it is called a Metric-Induced Network Topology (MINT). Tomography producing MINTs has been widely used to predict the performance of communications between clients and a server. Nowadays, grids connect up to thousands of communicating resources that may interact in a partially or totally coordinated way. Consequently, applications running on this kind of platform often involve massively concurrent bulk data transfers, which means that the client/server model is no longer valid. In this paper, we present MINTCar, a tool able to discover metric-induced network topologies using only end-to-end measurements, for paths that share neither a common source nor a common destination.

    Toward a Formal Multiscale Architectural Framework for Emerging Properties Analysis in Systems of Systems

    Systems of systems (SoSs) are composed of multiple operationally and managerially independent systems whose cooperation may lead to the emergence of unforeseen behaviors. Analysis of constituent and emergent properties is a cornerstone of SoS engineering. However, the systems constituting a SoS may be arbitrarily complex, so precise modeling of SoSs may produce an enormous amount of information. Analyzing and modeling SoSs is thus a difficult task: how can complexity be reconciled with provability of the existence or absence of (emergent) properties? Multiscale architecture modeling is appropriate for handling the inherent complexity of SoSs. Multiscale modeling makes it possible to look at a problem simultaneously from different scales and different levels of detail. It takes advantage of data available at distinct scales, managing the complexity of the behavior involved accordingly. Existing work on multiscale software architecture often specifies a set of fixed views with loose definitions of scales and scale dimensions, dramatically restricting scale usage. Furthermore, the specification of scale changes has been little studied, and scale changes are often handled as simple refinements. Yet an adequate representation of model transformations is a key factor in enabling system analysis. In this paper, we first present the formal definition of two scale dimensions: extend and grain. Extend allows various subsystems to be considered flexibly, while grain specifies different levels of detail. We formally define scale changes in this context and study their impact on system (emergent) properties.

    An Autonomic Cloud Management System for Enforcing Security and Assurance Properties

    Enforcing security properties in a Cloud is a difficult task that requires expertise. However, it is not the only security-related challenge met by a company migrating to a Cloud environment: the tenant must also have assurance that the requested security properties have effectively been enforced. Therefore, the Cloud provider has to offer a way of monitoring security. In this paper, we present a solution for expressing assurance properties based on the tenant's security requirements and for deploying these assurance properties. First, we introduce a language that expresses the assurance based on the tenant's security requirements. Second, we propose an infrastructure that deploys the assurance in a Cloud environment. This solution aims to be easy to use: the assurance directly results from the high-level expression of the tenant's security requirements, and no additional action is needed from the tenant. Consequently, we address one of the greatest drawbacks of security and assurance, the complexity of their configuration, while providing a complete assurance mechanism.

    Shortest Processing Time First and Hadoop

    Big data has revealed itself as a powerful tool for many sectors, ranging from science to business. Distributed data-parallel computing is therefore common nowadays: using a large number of computing and storage resources makes it possible to process data at a previously unknown scale. But to develop large-scale distributed big data processing, one has to tackle many challenges, one of the most complex being scheduling. Shortest Processing Time First (SPT), known to be an optimal online scheduling policy when it comes to minimizing the average flowtime, is a classic scheduling policy used in many systems. We therefore decided to integrate this policy into Hadoop, a framework for big data processing, and built a prototype implementation. This paper describes this integration, as well as test results obtained on our testbed.
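As a minimal illustration of the policy itself (not of the Hadoop integration), the following sketch runs jobs back to back on a single machine and computes the average flowtime, with and without SPT ordering:

```python
def average_flowtime(processing_times, spt=True):
    """Average flowtime (completion time) of jobs run sequentially
    on one machine, optionally ordered Shortest Processing Time First."""
    order = sorted(processing_times) if spt else list(processing_times)
    clock, total_flow = 0.0, 0.0
    for p in order:
        clock += p           # this job completes at the current clock
        total_flow += clock  # its flowtime counts toward the average
    return total_flow / len(order)

# SPT order (1, 2, 3) completes at times 1, 3, 6 -> average 10/3,
# while the submission order (3, 1, 2) gives (3, 4, 6) -> average 13/3.
jobs = [3.0, 1.0, 2.0]
assert average_flowtime(jobs) <= average_flowtime(jobs, spt=False)
```

The intuition is that short jobs no longer wait behind long ones, which is exactly the property the paper exploits when scheduling big data tasks.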

    Scalable Multi-Purpose Network Representation for Large Scale Distributed System Simulation

    Conducting experiments on large-scale distributed systems is usually time-consuming and labor-intensive. Uncontrolled external load variations prevent experiments from being reproduced, and such systems are often not available for research experiments, e.g., production systems or systems yet to be deployed. Hence, many researchers in the area of distributed computing rely on simulation to perform their studies. However, the simulation of large-scale computing systems raises several scalability issues, in terms of both speed and memory. Indeed, such systems now comprise millions of hosts interconnected through a complex network and run billions of processes. Most simulators therefore trade accuracy for speed and rely on very simple, easy-to-implement models. However, the assumptions underlying these models are often questionable, especially when it comes to network modeling. In this paper, we show that, despite a widespread belief in the community, achieving high scalability does not necessarily require resorting to overly simple models and ignoring important phenomena. We show that relying on a modular and hierarchical platform representation, while taking advantage of regularity when possible, allows us to model systems such as data and computing centers, peer-to-peer networks, grids, or clouds in a scalable way. This approach has been integrated into the open-source SimGrid simulation toolkit. We show that our solution allows us to model such systems much more accurately than other state-of-the-art simulators without sacrificing simulation speed; SimGrid is even sometimes orders of magnitude faster.
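One way to see why exploiting platform regularity saves memory: a homogeneous cluster can be stored as a single compact record whose hosts and routes are derived on demand instead of being materialized one by one. The sketch below assumes a star topology and uses illustrative field names; it is not SimGrid's actual data structure.

```python
class HomogeneousCluster:
    """One record describes N identical hosts; per-host objects and
    per-pair routes are computed arithmetically rather than stored."""
    def __init__(self, prefix, size, host_speed, link_bandwidth):
        self.prefix = prefix
        self.size = size
        self.host_speed = host_speed
        self.link_bandwidth = link_bandwidth

    def host(self, i):
        """Materialize host i on demand."""
        assert 0 <= i < self.size
        return {"name": f"{self.prefix}-{i}", "speed": self.host_speed}

    def route(self, i, j):
        """Star topology: i's up-link, the backbone, j's down-link."""
        if i == j:
            return []  # loopback traffic crosses no shared link
        return [f"{self.prefix}-{i}-up", "backbone", f"{self.prefix}-{j}-down"]

# A million-host cluster costs a handful of fields, not a million objects.
cluster = HomogeneousCluster("node", size=1_000_000, host_speed=1e9,
                             link_bandwidth=1.25e8)
```

Irregular parts of a platform can still be described explicitly and composed hierarchically with such regular zones, which is the modular representation the abstract refers to.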